OPE 


“Conclusion 


Management of personal computers in the corporate environment requires 
maintaining a balance between control and encouragement. The goal is to 
address the problems raised by the proliferation of personal computers in the 
organization and, at the same time, to take advantage of their capabilities 
and future potential wherever possible. 


Our approach to managing personal computers at Equitable has been to put an 
organization in place to follow four basic steps; 


DISCOVER what the personal computers can do for us, 

INFORM the rest of Equitable how they can be used innovatively, 
INFECT management and staff with interest in using them, and 
CONTROL their acquisition centrally with standards and policy. 


This strategy has been effective for us, allowing us to support a growing 
personal computer user population and to develop numerous projects that use 
personal computers for communicating management information. 
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This talk is for people who are just getting started with the PC MACRO 
Assembler. Maybe you are just contemplating doing some coding in 
assembler, maybe you have tried it with mixed success. If you are here to 
get aimed in the right direction, to get off to a good start with the 
assembler, then you have come for the right reason. I can't promise you'll 
get what you want, but I'll do my best. 


On the other hand, if you have already turned out some working assembler 
code, then this talk is likely to be on the elementary side for you. If 
you want to review a few basics and have no where else pressing to go, then 
by all means stay. 


Why Learn Assembler? 


The reasons for LEARNING assembler are not the same as the reasons for 
USING it in a particular application. But, we have to start with some of 
the reasons for using it and then I think the reasons for learning it will 
become clear. 


First, let's dispose of a bad reason for using it. Don't use it just 
because you think it is going to execute faster. A particular sequence of 
ordinary bread-and-butter computations written in PASCAL, C, FORTRAN, or 
compiled BASIC can do the job just about as fast as the same algorithm 
coded in assembler. Of course, interpretive BASIC is slower, but if you 
have a BASIC application which runs too slow you probably want to try com- 
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piling it before you think too much about translating parts of it to 
another language. 


On the other hand, high level languages do tend to isolate you from the 
machine. That is both their strength and their weakness. Usually, when 
implemented on a micro, a high level language provides an escape mechanism 
to the underlying operating system or to the bare machine. So, for 
example, BASIC has its PEEK and POKE. But, the route to the bare machine 
is often a circuitous one, leading to tricky programming which is hard to 
follow. 


For those of us working on PC's connected to SHARE-class mainframes, we are 
generally concerned with three interfaces: the keyboard, the screen, and 
the communication line or lines. All three of these entities raise machine 
dependent issues which are imperfectly addressed by the underlying operat- 
ing system or by high level languages. 


Sometimes, the system or the language does too little for you. For 
example, with the asynch adapter, the system provides no interrupt handler, 
no buffer, and no flow control. The application is stuck with the respon- 
sibility for monitoring that port and not missing any characters, then 
deciding what to do with all errors. BASIC does a reasonable job on some 
of this, but that is only BASIC. Most other languages do less. 


Sometimes, the system may do too much for you. System support for the key- 
board is an example. At the hardware level, all 83 keys on the keyboard 
send unique codes when they are pressed, held down, and released. But, 
someone has decided that certain keys, like Num Lock and Scroll Lock are 
going to do certain things before the application even sees them and can't 
therefore be used as ordinary keys. 


Sometimes, the system does about the right amount of stuff but does it less 
efficiently then it should. System support for the screen is in this 
class. If you use only the official interface to the screen you sometimes 
slow your application down unacceptably. I said before, don't use assem- 
bler just to speed things up, but there I was talking about mainline code, 
which generally can't be speeded up much by assembler coding. A critical 
system interface is a different matter: sometimes we may have to use 
assembler to bypass a hopelessly inefficient implementation. We don't want 
to do this if we can avoid it, but sometimes we can't. 


Assembly language code can overcome these deficiencies. In some cases, you 
can also overcome these deficiencies by judicious use of the escape valves 
which your high level language provides. In BASIC, you can PEEK and POKE 
and INP and OUT your way around a great many issues. In many other lan- 
guages you can issue system calls and interrupts and usually manage, one 
way or other, to modify system memory. Writing handlers to take real-time 
hardware interrupts from the keyboard or asynch port, though, is still 
going to be a problem in most languages. Some languages claim to let you 
do it but I have yet to see an acceptably clean implementation done that 
way. 


The real reason while assembler is better than "tricky POKEs" for writing 


machine-dependent code, though, is the same reason why PASCAL is better 
than assembler for writing a payroll package: it is easier to maintain. 
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Let the high level language do what it does best, but recognize that there 
are some things which are best done in assembler code. The assembler, 
unlike the tricky POKE, can make judicious use of equates, macros, labels, 
and appropriately placed comments to show what is really going on in this 
machine-dependent realm where it thrives. 


So, there are times when it becomes appropriate to write in assembler; giv- 
en that, if you are a responsible programmer or manager, you will want to 
be "assembler-literate" so you can decide when assembler code should be 
written. 


What do I mean by "assembler-literate?" I don't just mean understanding 
the 8086 architecture; I think, even if you don't write much assembler code 
yourself, you ought to understand the actual process of turning out assem- 
bler code and the various ways to incorporate it into an application. You 
ought to be able to tell good assembler code from bad, and appropriate 
assembler code from inappropriate. 


Steps to becoming ASSEMBLER-LITERATE 


1. Learn the 8086 architecture and most of the instruction set. Learn 
what you need to know and ignore what you don't. Reading: The 8086 
Primer by Stephen Morse, published by Hayden. You need to read only 
two chapters, the one on machine organization and the one on the 
instruction set. 


2. Learn about a few simple DOS function calls. Know what services the 
operating system provides. If appropriate, learn a little about other 
systems too. It will aid portability later on. Reading: appendices D 
and E of the PC DOS manual. 


3. Learn enough about the MACRO assembler and the LINKer to write some 
simple things that really work. Here, too, the main thing is figuring 
out what you don't need to know. Whatever you do, don't study the sam- 
ple programs distributed with the assembler unless you have nothing 
better! 


4. At the same time as you are learning the assembler itself, you will 
need to learn a few tools and concepts to properly combine your assem- 
bler code with the other things you do. If you plan to call assembler 
subroutines from a high level language, you will need to study the 
interface notes provided in your language manual. Usually, this forms 
an appendix of some sort. If you plan to package your assembler rou- 
tines as .COM programs you will need to learn to do this. You should 
also learn to use DEBUG. 


5. Read the Technical Reference, but very selectively. The most important 
things to know are the header comments in the BIOS listing. Next, you 
will want to learn about the RS 232 port and maybe about the video 
adapters. 
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Notice that the key thing in all. five phases .is being selective. It is 
easy to conclude that there is too much to learn unless you can throw away 
what you don't need. Most of the rest of this talk is going to deal with 
this very important question of what you need and don't need to learn in 
each phase. In some cases, I will have to leave you to do almost all of 
the learning, in others, I will teach a few salient points, enough, I hope, 
to get you started. I hope you understand that all I can do in an hour is 
get you started on the way. 


Phase 1: Learn the architecture and instruction set 


The Morse book might seem like a lot of book to buy for just two really 
important chapters; other books devote a lot more space to the instruction 
set and give you a big beautiful reference page on each instruction. And, 
some of the other things in the Morse book, although interesting, really 
aren't very vital and are covered too sketchily to be of any real help. 

The reason I like the Morse book is that you can just read it; it has a 
very conversational style, it is very lucid, it tells you what you really 
need to know, and a little bit more which is by way of background; because 
nothing really gets belabored to much, you can gracefully forget the things 
you don't use. And, I very much recommend READING Morse rather than study- 
ing it. Get the big picture at this point. 


Now, you want to concentrate on those things which are worth fixing in mem- 
ory. After you read Morse, you should relate what you have learned to this 
outline. 


1. You want to fix in your mind the idea of the four segment registers 
CODE, DATA, STACK, and EXTRA. This part is pretty easy to grasp. The 
8086 and the 8088 use 20 bit addresses for memory, meaning that they 
can address up to 1 megabyte of memory. But, the registers and the 
address fields in.all the instructions are no more that 16 bits long. 
So, how to address all of that memory? Their solution is to put 
together two 16 bit quantities like this: 


calculation SSSSO -~--- value in the relevant segment register SHL 4 
depicted in AAAA ---- apparent address from register or instruction 
hexadecimal -------- 

RRRRR ---- real address placed on address bus 


In other words, any time memory is accessed, your program will supply a 
sixteen bit address. Another sixteen bit address is acquired from a 
segment register, left shifted four bits (one nibble) and added to it 
to form the real address. You can control the values in the segment 
registers and thus access any part of memory you want. But the segment 
registers are specialized: one for code, one for most data accesses, 
one for the stack (which we'll mention again) and one "extra" one for 
additional data accesses. 


Most people, when they first learn about this addressing scheme become 


obsessed with converting everything to real 20 bit addresses. After a 
while, though, you get use to thinking in segment/offset form. You 
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tend to get your segment registers set up at the beginning of the pro- 
gram, change them as little as possible, and think just in terms of 
symbolic locations in your program, as with any assembly language. 


EXAMPLE : 
MOV AX,DATASEG 
MOV DS,AX ;Set value of Data segment 


ASSUME DS :DATASEG ;Tell assembler DS is usable 


MOV AX,PLACE ;Access storage symbolically by 16 bit address 
In the above example, the assembler knows that no special issues are 
involved because the machine generally uses the DS register to complete 

a normal data reference. 


If you had used ES instead of DS in the above example, the assembler 
would have known what to do, also. In front of the MOV instruction 
which accessed the location PLACE, it would have placed the ES segment 
prefix. This would tell the machine that ES should be used, instead of 
DS, to complete the address. 


Some conventions make it especially easy to forget about segment regis- 
ters. For example, any program of the COM type gets control with all 
four segment registers containing the same value. This program exe- 
cutes in a simplified 64K address space. You can go outside this 
address space if you want but you don't have to. 


You will want to learn what other registers are available and learn 
their personalities: 


e AX and DX are general purpose registers. They become special only 
when accessing machine and system interfaces. 


° CX is a general purpose register which is slightly specialized for 
counting. 


« BX is a general purpose register which is slightly specialized for 
forming base-displacement addresses. 


e AX-DX can be divided in half, forming AH, AL, BH, BL, CH, CL, DH, 
Dia 


° SI and DI are strictly 16 bit. They can be used to form indexed 
addresses (like BX) and they are also used to point to strings. 


° SP is hardly ever manipulated. It is there to provide a stack. 


° BP is a manipulable cousin to SP. Use it to access data which has 
been pushed onto the stack. 


« Most sixteen bit operations are legal (even if unusual) when per- 
formed in SI, DI, SP, or BP. 
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You will want to learn the classifications of operations available 
WITHOUT getting hung up in the details of how 8086 opcodes are con- 
structed. 


8086 opcodes are complex. Fortunately, the assembler opcodes used to 
assemble them are simple. When you read a book like Morse, you will 
learn some things which are worth knowing but NOT worth dwelling on. 


a. 8086 and 8088 instructions can be broken up into subfields and bits 

' with names like R/M, MOD, S and W. These parts of the instruction 
modify the basic operation in such ways as whether it is 8 bit or 
16 bit, if 16 bit, whether all 16 bits of the data are given, 
whether the instruction is register to register, register to 
memory, or memory to register, for operands which are registers, 
which register, for operands which are memory, what base and index 
registers should be used in finding the data. 


b. Also, some instructions are actually represented by several differ- 
ent machine opcodes depending on whether they deal with immediate 
data or not, or on other issues, and there are some expedited forms 
which assume that one of the arguments is the most commonly used 
operand, like AX in the case of arithmetic. 


There is no point in memorizing any of this detail; just distill the 
bottom line, which is, what kinds of operand combinations EXIST in the 
instruction set and what kinds don't. If you ask the assembler to ADD 
two things and the two things are things for which there is a legal ADD 
instruction somewhere in the instruction set, the assembler will find 
the right instruction and fill in all the modifier fields for you. 


I guess if you memorized all the opcode construction rules you might 
have a crack at being able to disassemble hex dumps by eye, like you 
may have learned to do somewhat with 370 assembler. I submit to you 
that this feat, if ever mastered by anyone, would be in the same class 
as playing the "Minute Waltz" in a minute; a curiosity only. 


Here is the basic matrix you should remember: 
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Two operands: One operand: 
5 ie aaa | R 

Mi <-+-"R M 

R <-- R s x 

R|M <-- I 


R|M <-- S * 
S <-- R|IM * 


* -- data moving instructions (MOV, PUSH, POP) only 
S -- segment register (CS, DS, ES, SS) 
R -- ordinary register (AX, BX, CX, DX, SI, DI, BP, SP, 
AH, AL, BH, BL, CH, CL, DH, DL) 
M -- one of the following 
pure address 
[BX]+offset 
[BP]+offset 
any of the above indexed by SI 
any of the first three indexed by DI 


4. Of course, you want to learn the operations themselves. As I've -sug- 


gested, you want to learn the op codes as the assembler presents them, 
not as the CPU machine language presents them. So, even though there 
are many MOV op codes you don't need to learn them. Basically, here is 
the instruction set: 


a. Ordinary two operand instructions. These instructions perform an 
operation and leave the result in place of one of the operands. 
They are 


1) ADD and ADC -- addition, with or without including a carry from 
a previous addition 


2) SUB and SBB -- subtraction, with or without including a borrow 
from a previous subtraction 


3) CMP -- compare. It is useful to think of this as a subtraction 
with the answer being thrown away and neither operand actually 
changed 


4) AND, OR, XOR -- typical boolean operations 


5) TEST -- like an AND, except the answer is thrown away and nei- 
ther operand is changed. 


6) MOV -- move data from source to target 


7) LDS, LES, LEA -- some specialized forms of MOV with side 
effects 


b. Ordinary one operand instructions. These can take any of the oper- 
and forms described above. Usually, the perform the operation and 
leave the result in the stated place: 


1) INC -- increment contents 
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2) DEC -- decrement contents 


3) NEG -- twos complement 


4) NOT -- ones complement 
5) PUSH -- value goes on stack (operand location itself unchanged) 
6) POP -- value taken from stack, replaces current value 


Now you touch on some instructions which do not follow the general 
operand rules but which require the use of certain registers. The 
important ones are 


1) The multiply and divide instructions 


2) The "adjust" instructions which help in performing arithmetic 
on ASCII or packed decimal data 


3) The shift and rotate instructions. These have a restriction on 
the second operand: it must either be the immediate value 1 or 


the contents of the CL register. 


4) IN and OUT which send or receive data from one of the 1024 
hardware ports. 


5) CBW and CWD -- convert byte to word or word to doubleword by 
sign extension 


Flow of control instructions. These deserve study in themselves 
and we will discuss them a little more. They include 


1) CALL, RET -- call and return 
2) INT, IRET -- interrupt and return-from-interrupt 
3) JMP -- jump or "branch" 


4) LOOP, LOOPNZ, LOOPZ -- special (and useful) instructions which 
implement a counted loop similar to the 370 BCT instruction 


5) various conditional jump instructions 


String instructions. These implement a limited storage-to-storage 
instruction subset and are quite powerful. All of them have the 


property that 


1) The source of data is described by the combination DS and SI. 


2) The destination of data is described by the combination ES and 
DI. 


3) As part of the operation, the SI and/or DI register(s) is(are) 
incremented or decremented so the operation can be repeated. 
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They include 

1) CMPSB/CMPSW -- compare byte or word 

2) LODSB/LODSW -- load byte or word into AL or AX 

3) STOSB/STOSW -- store byte or word from AL or AX 

4) MOVSB/MOVSW -- move byte or word 

5) SCASB/SCASW -- compare byte or word with contents of AL or AX 

6) REP/REPE/REPNE -- a prefix which can be combined with any of 
the above instructions to make them execute repeatedly across a 
string of data whose length is held in CX. 

f. Flag instructions: CLI, STI, CLD, STD, CLC, STC. These can set or 
clear the interrupt (enabled) direction (for string operations) or 
carry flags. 

‘The addressing summary and the instruction summary given above masks a 

lot of annoying little exceptions. For example, you can't POP CS, and 

although the R <-- M form of LES is legal, the M <-- R form isn't etc. 
etc. My advice is 

a. Go for the general rules 

b. Don't try to memorize the exceptions 

c. Rely on common sense and the assembler to teach you about 
exceptions over time. A lot of the exceptions cover things you 


wouldn't want to do anyway. 


5. A few instructions are rich enough and useful enough to warrent careful 
study. Here are a few final study guidelines: 


a. It is well worth the time learning to use the string instruction 
set effectively. Among the most useful are 


REP MOVSB ;moves a string : 


REP STOSB ;initializes memory 
REPNE SCASB . ;look up occurance of character in string 
REPE CMPSB ;compare two strings 


b. Similarly, if you have never written for a stack machine before, 
you will need to exercise PUSH and POP and get very comfortable 
with them because they are going to be good friends. If you are 
used to the 370, with lots of general purpose registers, you may 
find yourself feeling cramped at first, with many fewer registers 
and many instructions having register restrictions. But, you have 
a hidden ally: you need a register and you don't want to throw 
away what's in it? Just PUSH it, and when you are done, POP it 
back. This can lead to abuse. Never have more than two 
"expedient" PUSHes in effect and never leave something PUSHed 
across a major header comment or for more than 15 instructions or 
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so. An exception is the saving and restoring of registers at 
entrance to and exit from a subroutine; here, if the subroutine is 


‘long, you should probably PUSH everything which the caller may need 


saved, whether you will use the register or not, and POP it in 
reverse order at the end. 


Be aware that CALL and INT push return address information on the 
stack and RET and IRET pop it off. It is a good idea to become 
familiar with the structure of the stack. 


In practice, to invoke system services you will use the INT 
instruction. It is quite possible to use this instruction effec- 
tively in a cookbook fashion without knowing precisely how it 
works. 


The transfer of control instructions (CALL, RET, JMP) deserve care- 
ful study to avoid confusion. You will learn that these can be 
classified as follows: 


1) all three have the capability of being either NEAR (CS register 
unchanged) or FAR (CS register changed) 


2) JMPs and CALLs can be DIRECT (target is assembled into instruc- 
tion) or INDIRECT (target fetched from memory or register) 


3) if NEAR and DIRECT, a JMP can be SHORT (less than 128 bytes 
away) or LONG 


In general, the third issue is not worth worrying about. On a for- 
ward jump which is clearly VERY short, you can tell the assembler 
it is short and save one byte of code: 


JMP SHORT CLOSEBY 


On a backward jump, the assembler can figure it out for you. Ona 
forward jump of dubious length, let the assembler default to a LONG 
form; at worst you waste one byte. 


Also leave the assembler to worry about how the target address is 
to be represented, in absolute form or relative form. 


The conditional jump set is rather confusing when studied apart 
from the assembler, but you do need to get a feeling for it. The 
interactions of the sign, carry, and overflow flags can get your 
mind stuttering pretty fast if you worry about it too much. What 
is boils down to, though, is 


JZ means what it says 

JINZ means what it says 

JG reater this means "if the SIGNED difference is positive" 
JA bove this means "if the UNSIGNED difference is positive" 
JL ess this means "if the SIGNED difference is negative" 
JB elow this means "if the UNSIGNED difference is negative!" 
JC arry assembles the same as JB; it's an aesthetic choice 
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You should understand that all conditional jumps are inherently 
DIRECT, NEAR, and "short"; the "short" part means that they can't 
go more than 128 bytes in either direction. Again, this is some- 
thing you could easily imagine to be more of a problem than it is. 
I follow this simple approach: 


1) When taking an abnormal exit from a block of code, I always use 
an unconditional jump. Who knows how far you are going to end 
up jumping by the time the program is finished. For example, I 
wouldn't code this: 


TEST AL, IDIBIT ;Is the idiot bit on? 
JNZ OYVEY ;Yes. Go to general cleanup 


Rather, I would probably code this: 


TEST AL,IDIBIT ;Is the idiot bit on? 

JZ NOIDIOCY ;No. I am saved. 

JMP OYVEY ;Yes. What can we say... 
NOIDIOCY: 


The latter, of course, is a jump around a jump. Some would say 
it is evil, but I submit it is hard to avoid in this language. 


2) Otherwise, within a block of code, I use conditional jumps 
freely. If the block eventually grows so long that the assem- 
bler starts complaining that my conditional jumps are too long 
i 


a) consider reorganizing the block but 
b) also consider changing some conditional jumps to their 
opposite and use the "jump around a jump" approach as shown 


above. 


Enough about specific instructions! 


6. Finally, in order to use the assembler effectively, you need to know 


the default rules for which segment registers are used to complete 
addresses in which situations. 


a. CS is used to complete an address which is the target of a NEAR 
DIRECT jump. On an NEAR INDIRECT jump, DS is used to fetch the 
address from memory but then CS is used to complete the address 
thus fetched. On FAR jumps, of course, CS is itself altered. The 
instruction counter is always implicitly pointing in the code seg- 
ment. 


b. SS is used to complete an address if BP is used in its formation. 
Otherwise, DS is always used to complete a data address. 


c. On the string instructions, the target is always formed from ES and 


DI. The source is normally formed from DS and SI. If there is a 
segment prefix, it overrides the source not the target. 
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Learning about DOS 


I think the best way to learn about DOS internals is to read the technical 

appendices in the manual. These are not as complete as we might wish, but 

they really aren't bad; I certainly have learned a lot from them. What you 
don't learn from them you might eventually learn via judicious disassembly 

of parts of DOS, but that shouldn't really be necessary. 


From reading the technical appendices, you learn that interrupts 20H 
through 27H are used to communicate with DOS. Mostly, you will use inter- 
rupt 21H, the DOS function manager. 


The function manager implements a great many services. You request the 
individual services by means of a function code in the AH register. For 
example, by putting a nine in the AH register and issuing interrupt 21H you 
tell DOS to print a message on the console screen. 


Usually, but by no means always, the DX register is used to pass data for 
the service being requested. For example, on the print message service 
just mentioned, you would put the 16 bit address of the message in the DX 
register. The DS register is also implicitly part of this argument, in 
keeping with the universal segmentation rules. 


In understanding DOS functions, it is useful to understand some history and 
also some of the philosophy of MS-DOS with regard to portability. General- 
ly, you will find, once you read the technical information on DOS and also 
the IBM technical reference, you will know more than one way to do almost 
anything. Which is best? For example, to do asynch adapter I/0, you can 
use the DOS calls (pretty incomplete), you can use BIOS, or you can go 
directly to the hardware. The same thing is true for most of the other 
primitive I/O (keyboard or screen) although DOS is more likely to give you 
added value in these areas. When it comes to file I/O, DOS itself offers 
more than one interface. For example, there are four calls which read data 
from a file. 


The way to decide rationally among these alternatives is by understanding 
the tradeoffs of functionality versus portability. Three kinds of porta- 
bility need to be considered: machine portability, operating system porta- 
bility (for example, the ability to assemble and run code under CP/M 86) 
and DOS version portability (the ability for a program to run under older 
versions of DOS>. 


Most of the functions originally offered in DOS 1.0 were direct descendents 
of CP/M functions; there is even a compatibility interface so that programs 
which have been translated instruction for instruction from 8080 assembler 
to 8086 assembler might have a reasonable chance of running if they use 
only the core CP/M function set. Among the most generally useful in this 
original compatibility set are 
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09 -- print a full message on the screen 


OA -- get a console input line with full DOS editing 
OF -- open a file 

10 -- close a file (really needed only when writing) 
11 -- find first file matching a pattern 

12 -- find next file matching a pattern 

13 -- erase a file 

16 -- create a file 

17 -~ rename a file 

1A -- set disk transfer address 


The next set provide no function above what you can get with BIOS calls or 
more specialized DOS calls. However, they are preferable to BIOS calls 
when portability is an issue. 


00 -- terminate execution 

01 -- read keyboard character 

02 -- write screen character 

03 -- read COM port character 

04 -- write COM port character 

05 -- print a character 

06 -- read keyboard or write screen with no editing 


The standard file I/O calls are inferior to the specialized DOS calls but 
have the advantage of making the program easier to port to CP/M style sys- 
tems. Thus they are worth mentioning: 


14 -- sequential read from file 
15  -- sequential write to file 
21 -- random read from file 

22 -- random write to file 

23 ~~ determine file size 

24 -- set random record 


In addition to the CP/M compatible services, DOS also offers some special- 
ized services which have been available in all releases of DOS. These 
include 


27 -- multi-record random read. 
28 -- multi-record random write. 
29 -- parse filename 


2A-2D -- get and set date and time 


All of the calls mentioned above which have anything to do with files make 
use of a data area called the "FILE CONTROL BLOCK" (FCB). The FCB is any- 
where from 33 to 37 bytes long depending on how it is used. You are 
responsible for creating an FCB and filling in the first 12 bytes, which 
contain a drive code, a file name, and an extension. 


When you open the FCB, the system fills in the next 20 bytes, which 
includes a logical record length. The initial lrecl is always 128 bytes, 
to achieve CP/M compatibility. The system also provides other useful 
information such as the file size. 
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After you have opened the FCB, you can change the logical record length. 

If you do this, your program is no longer CP/M compatible, but that doesn't 
make it a bad thing to do. DOS documentation suggests you use a logical 
record length of one for maximum flexibility. This is usually a good 
recommendation. 


To perform actual 1/0 to a file, you eventually need to fill in byte 33 or 
possibly bytes 34-37 of the FCB. Here you supply information about the 
record you are interested in reading or writing. For the most part, this 


‘part of the interface is compatible with CP/M. 


In general, you do not need to (and should not) modify other parts of the 
FCB. 


The FCB is pretty well described in appendix E of the DOS manual. 


Beginning with DOS 2.0, there is a whole new system of calls for managing 
files which don't require that you build an FCB at all. These calls are 
quite incompatible with CP/M and also mean that your program cannot run 
under older releases of DOS. However, these calls are very nice and easy 
to use. They have these characteristics 


1. To open, create, delete, or rename a file, you need only a character 
string representing its name. 


2. The open and create calls return a 16 bit value which is simply placed 
in the BX register on subsequent calls to refer to the file. 


3. There is not a separate call required to specify the data buffer. 


4. Any number of bytes can be transfered on a single call; no data area 
must be manipulated to do this. 


The "new" DOS calls also include comprehensive functions to manipulate the 
new chained directory structure and to allocate and free memory. 


Learning the assembler 


It is my feeling that many people can teach themselves to use the assembler 
by reading the MACRO Assembler manual if 


1. You have read and understood a book like Morse and thus have a feeling 
for the instruction set 


2. You know something about DOS services and so can communicate with the 
keyboard and screen and do something marginally useful with files. In 
the absence of this kind of knowledge, you can't write meaningful prac- 
tice programs and so will not progress. 


3. You have access to some good examples (the ones supplied with the 
assembler are not good, in my opinion. I will try to supply you with 
some more relevant ones. 
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4. You ignore the things which are most confusing and least useful. Some 
of the most confusing aspects of the assembler include the facilities 
combining segments. But, you can avoid using all but the simplest of 
these facilities in many cases, even while writing quite substantial 
applications. 


5. The easiest kind of assembler program to write is a COM program. They 
might seem harder, at first, then EXE programs because there is an 
extra step involved in creating the executable file, but COM programs 
are structurally very much simpler. 


At this point, it is necessary to talk about COM programs and EXE programs. 
As you probably know, DOS supports two kinds of executable files. EXE pro- 
grams are much more general, can contain many segments, and are generally 
built by compilers and sometimes by the assembler. If you follow the lead 
given by the samples distributed with the assembler, you will end up with 
EXE programs. A COM program, in contrast, always contains just one 
segment, and receives control with all four segment registers containing 
the same value. A COM program, thus, executes in a simplified environment, 
a 64K address space. You can go outside this address space simply by tem- 
porarily changing one segment register, but you don't have to, and that is 
the thing which makes COM programs nice and simple. Let's look at a very 
simple one. 


The classic text on writing programs for the C language says that the first 
thing you should write is a program which says 


HELLO, WORLD. 


when invoked. What's sauce for C is sauce for assembler, so let's start 
with a HELLO program of our own. My first presentation of this will be 
bare bones, not stylistically complete, but just an illustration of what an 
assembler program absolutely has to have: 


HELLO SEGMENT ;Set up HELLO code and data section 
ASSUME CS:HELLO,DS:HELLO ;Tell assembler about conditions at entry 
ORG 100H -;A .COM program begins with 100H byte prefix 
MAIN: JMP BEGIN ;Control must start here 
MSG DB 'Hello, world.$' ;But it is generally useful to put data first 
BEGIN: MOV DX,OFFSET MSG sLet DX --> message. 
MOV AH,9 ;Set DOS function code for printing a message 
INT 21H ;Invoke DOS . 
RET ;Return to system 
HELLO ENDS ;End of code and data section 


END MAIN ;Terminate assembler and specify entry point 
First, let's attend to some obvious points. The macro assembler uses the 
general form 
name opcode operands 
Unlike the 370 assembler, though, comments are NOT set off from operands by 


blanks. The syntax uses blanks as delimiters within the operand field (see 
line 6 of the example) and so all comments must be set off by semi-colons. 
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Line comments are frequently set off with a semi-colon in column 1. I use 
this approach for block comments too, although there is a COMMENT statement 
which can be used to introduce a block comment. 


Being an old 370 type, I like to see assembler code in upper case, although 
my comments are mixed case. Actually, the assembler is quite happy with 
mixed case anywhere. 


As with any assembler, the core of the opcode set consists of opcodes which 
generate machine instructions but there are also opcodes which generate 
data and ones which function as instructions to the assembler itself, some- 
times called pseudo-ops. In the example, there are five lines which gener- 
ate machine code (JMP, MOV, MOV, INT, RET), one line which generates data 
(DB) and five pseudo-ops (SEGMENT, ASSUME, ORG, ENDS, and END). 


We will discuss all of them. 


Now, about labels. You will see that some labels in the example end in a 
colon and some don't. This is just a bit confusing at first, but no real 
mystery. If a label is attached to a piece of code (as opposed to data), 
then the assembler needs to know what to do when you JMP to or CALL that 
label. By convention, if the label ends in a colon, the assembler will use 
the NEAR form of JMP or CALL. If the label does not end in a colon, it 
will use the FAR form. In practice, you will always use the colon on any 
label you are jumping to inside your program because such jumps are always 
NEAR; there is no reason to use a FAR jump within a single code section. I 
mention this, though, because leaving off the colon isn't usually trapped 
as a syntax error, it will generally cause something more abstruse to go 
wrong. 


On the other hand, a label attached to a piece of data or a pseudo-op never 
ends in a colon. 


Machine instructions will generally take zero, one or two operands. Where’ 
there are two operands, the one which receives the result goes on the left 


as in 370 assembler. 


I tried to explain this before, now maybe it will be even clearer: there 


_ are many more 8086 machine opcodes then there are assembler opcodes to rep- 


resent them. For example, there are five kinds of JMP, four kinds of CALL, 
two kinds of RET, and at least five kinds of MOV depending on how you count 
them. The macro assembler makes a lot of decisions for you based on the 
form taken by the operands or on attributes assigned to symbols elsewhere 
in your program. In the example above, the assembler will generate the 
NEAR DIRECT form of JMP because the target label BEGIN labels a piece of 
code instead of a piece of data (this makes the JMP DIRECT) and ends in a 
colon (this makes the JMP NEAR). The assembler will generate the immediate 
forms of MOV because the form OFFSET MSG refers to immediate data and 
because 9 is a constant. The assembler will generate the NEAR form of RET 
because that is the default and you have not told it otherwise. 


The DB (define byte) pseudo-op is an easy one: it is used to put one or 
more bytes of data into storage. There is also a DW (define word) 
pseudo-op and a DD (define doubleword) pseudo-op; in the PC MACRO assem- 
bler, the fact that a label refers to a byte of storage, a word of storage, 
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or a doubleword of storage can be very significant in ways which we will 
see presently. 


About that OFFSET operator, I guess this is the best way to make the point 
about how the assembler decides what instruction to assemble: an analogy 
with 370 assembler: 


PLACE DC Pn Te 
LA R1,PLACE 
L Rl, PLACE 


In 370 assembler, the first instruction puts the address of label PLACE in 
register 1, the second instruction puts the contents of storage at label 
PLACE in register 1. Notice that two different opcodes are used. In the 
PC assembler, the analogous instructions would be 


PLACE DW kee 
MOV DxX,OFFSET PLACE 
MOV DxX,PLACE 


If PLACE is the label of a word of storage, then the second instruction 
will be understood as a desire to fetch that data into DX. If X isa 


- label, then "OFFSET X" means "the ordinary number which represents X's off- 


set from the start of the segment." And, if the assembler sees an ordinary 
number, as opposed to a label, it uses the instruction which is equivalent 
to LA. 


If PLACE were the label of a DB pseudo-op, instead of a DW, then 
MOV DX,PLACE 


would be illegal. The assembler worries about length attributes of its 
operands. 


Next, numbers and constants in general. The assembler's default radix is 
decimal. You can change this, but I don't recommend it. If you want to 
represent numbers in other forms of notation such as hex or bit, you gener- 
ally use a trailing letter. For example, 


21H 
is hexidecimal 21, 
00010000B 
is the eight bit binary number pictured. 


The next elements we should point to are the SEGMENT...ENDS pair and the 
END instruction. Every assembler program has to have these elements. 


SEGMENT tells the assembler you are starting a section of contiguous mate- 
rial (code and/or data). The symmetrically named ENDS statement tells the 
assembler you are finished with a section of contiguous material. I wish 
they didn't use the word SEGMENT in this context. To me, a "segment" is a 
hardware construct: it is the 64K of real storage which becomes address- 
able by virtue of having a particular value in a segment register. Now, it 
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is true that the "segments" you make with the assembler often correspond to 
real hardware "segments" at execution time. But, if you look at things 
like the GROUP and CLASS options supported by the linker, you will discover 
that this correspondence is by no means exact. So, at risk of maybe con- 
fusing you even more, I am going to use the more informal term "section" to 
refer to the area set off by means of the SEGMENT and ENDS instructions. 


The sections delimited by SEGMENT...ENDS pairs are really a lot like CSECTs 
and DSECTs in the 370 world. 


I strongly recommend that you be selective in your study of the SEGMENT 
pseudo-op as described in the manual. Let me just touch on it here. 


name SEGMENT 
name SEGMENT PUBLIC 
name SEGMENT AT nnn 


Basically, you can get away with just the three forms given above. The 
first form is what you use when you are writing a single section of assem- 
bler code which will not be combined with other pieces of code at link 
time. The second form says that this assembly only contains part of the 
section; other parts might be assembled separately and combined later by 
the linker. 


I have found that one can construct reasonably large modular applications 
in assembler by simply making every assembly use the same segment name and 
declaring the name to be PUBLIC each time. If you read the assembler and 
linker documentation, you will also be bombarded by information about more 
complex options such as the GROUP statement and the use of other "combine 
types" and "classes." I don't recommend getting into any of that. I will 
talk more about the linker and modular construction of programs a little 
later. The assembler manual also implies that a STACK segment is required. 
This is not really true. There are numerous ways to assure that you have a 
valid stack at execution time. 


Of course, if you plan to write applications in assembler which are more 
than 64K in size, you will need more than what I have told you; but who is 
really going to do that? Any application that large is likely to be coded 
in a higher level language. 


The third form of the SEGMENT statement makes the delineated section into 
something like a "DSECT;" that is, it doesn't generate any code, it just 
describes what is present somewhere already in the computer's memory. 
Sometimes the AT value you give is meaningful. For example, the BIOS work 
area is located at location 40 hex. So, you might see 


BIOSAREA SEGMENT AT 40H ;Map BIOS work area 
ORG BIOSAREA+10H 
EQUIP DB ? ;Location of equipment flags, first byte 


BIOSAREA ENDS 
in a program which was interested in mucking around in the BIOS work area. 
At other times, the AT value you give may be arbitrary, as when you are 


mapping a repeated control block: 
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PROGPREF SEGMENT AT 0 ;Really a DSECT mapping the program prefix 
ORG PROGPREF+6 
MEMSIZE DW ? ;Size of available memory 


PROGPREF ENDS 


Really, no matter whether the AT value represents truth or fiction, it is 
your responsibility, not the assembler's, to get set up a segment register 
so that you can really reach the storage in question. So, you can't say 


MOV AL,EQUIP 
unless you first say something like 


MOV AX,BIOSAREA ;BIOSAREA becomes a symbol with value 40H 
MOV ES,AX 
ASSUME ES:BIOSAREA 


Enough about SEGMENT. The END statement is simple. It goes at the end of 
every assembly. When you are assembling a subroutine, you just say 


END 
but when you are assembling the main routine of a program you say 
END label 
where 'label' is the place where execution is to begin. 


Another pseudo-op illustrated in the program is ASSUME. ASSUME is like the 
USING statement in 370 assembler. However, ASSUME can ONLY refer to seg- 
ment registers. The assembler uses ASSUME information to decide whether to 
assemble segment override prefixes and to check that the data you are try- 
ing to access is really accessible. In this case, we can reassure the 
assembler that both the CS and DS registers will address the section called 
HELLO at execution time. Actually, the SS and ES registers will too, but 


. the assembler never needs to make use of this information. 


I guess I have explained everything in the program except that ORG 
pseudo-op. ORG means the same thing as it does in many assembly languages. 
It tells the assembler to move its location counter to some particular 
address. In this case, we have asked the assembler to start assembling 
code hex 100 bytes from the start of the section called HELLO instead of at 
the very beginning. This simply reflects the way COM programs are loaded. 
When a COM program is loaded by the system, the system sets up all four 
segment registers to address the same 64K of storage. The first 100 hex 
bytes of that storage contains what is called the program prefix; this area 
is described in appendix E of the DOS manual. Your COM program physically 
begins after this. Execution begins with the first physical byte of your 
program; that is why the JMP instruction is there. 


Wait a minute, you say, why the JMP instruction at all? Why not put the 
data at the end? Well, in a simple program like this I probably could have 
gotten away with that. However, I have the habit of putting data first and 
would encourage you to do the same because of the way the assembler has of 
assembling different instructions depending on the nature of the operand. 
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Unfortunately, sometimes the different choices of instruction which can 
assemble from a single opcode have different lengths. ‘If the assembler has 
already seen the data when it gets to the instructions it has a good chance 
of reserving the right number of bytes on the first pass. If the data is 
at the end, the assembler may not have enough information on the first pass 
to reserve the right number of bytes for the instruction. Sometimes the 
assembler will complain about this, something like "Forward reference is 
illegal" but at other times, it will make some default assumption. On the 
second pass, if the assumption turned out to be wrong, it will report what 
is called a "Phase error," a very nasty error to track down. So get in the 
habit of putting data and equated symbols ahead of code. 


OK. Maybe you understand the program now. Let's walk through the steps 
involved in making it into a real COM file. 


1. The file should be created with the name HELLO.ASM (actually the name 
is arbitrary but the extension .ASM is conventional and useful) 
% 


ASM HELLO, , ; 


(this is just one example of invoking the assembler; it uses the small 
assembler ASM, it produces an object file and a listing file with the 
same name as the source file. I am not going exhaustively into how to 
invoke the assembler, which the manual goes into pretty well. I guess 
this is the first time I mentioned that there are really two 
assemblers; the small assembler ASM will run in a 64K machine and 
doesn't support macros. I used to use it all the time; now that I have 
a bigger machine and a lot of macro libraries I use the full function 
assembler MASM. You get both when you buy the package). 


3. If you issue DIR at this point, you will discover that you have 
acquired HELLO.OBJ (the object code resulting from the assembly) and 
HELLO.LST (a listing file). I guess I can digress for a second here 
concerning the listing file. It contains TAB characters. I have found 
there are two good ways to get it printed and one bad way. The bad way 
is to use LPT1: as the direct target of the listing file or to try 
copying the LST file to LPT1 without first setting the tabs on the 
printer. The two good ways are to either 


a. direct it to the console and activate the printer with CTRL-PRTSC. 
In this case, DOS will expand the tabs for you. 


b. direct to LPT1: but first send the right escape sequence to LPT1 to 
set the tabs every eight columns. I have found that on some early 
serial numbers of the IBM PC printer, tabs don't work quite right, 
which forces you to the first option. 


4. 
LINK HELLO; 
(again, there are lots of linker options but this is the simplest. It 
takes HELLO.OBJ and makes HELLO.EXE). HELLO.EXE? I thought we were 
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making a COM program, not an EXE program. Right. HELLO.EXE isn't 
really executable; its just that the linker doesn't know about COM pro- 
grams. That requires another utility. You don't have this utility if 
you are using DOS 1.0; you have it if you are using DOS 1.1 or DOS 2.0. 
Oh, by the way, the linker will warn you that you have no stack 
segment. Don't worry about it. 


EXEZBIN HELLO HELLO.COM 


This is the final step. It produces the actual program you will exe- 
cute. Note that you have to spell out HELLO.COM; for a nominally 
rational but actually perverse reason, EXE2BIN uses the default exten- 
sion BIN instead of COM for its output file. At this point, you might 
want to erase HELLO.EXE; it looks a lot more useful than it is. 
Chances are you won't need to recreate HELLO.COM unless you change the 
source and then you are going to have to redo the whole thing. 


HELLO 
You type hello, that invokes the program, it says 
HELLO YOURSELF!!! 


(oops, what did I do wrong....?) 


What about subroutines? 


I started with a simple COM program because I actually think they are easi- 
er to create than subroutines to be called from high level languages, but 
maybe its really the latter you are interested in. Here, I think you 
should get comfortable with the assembler FIRST with little exercises like 
the one above and also another one which I will finish up with. 


Next you are ready to look at the interface information for your particular 
language. You usually find this in some sort of an appendix. For example, 
the BASIC manual has Appendix C on Machine Language Subroutines. The 
PASCAL manual buries the information a little more deeply: the interface 
to a separately compiled routine can be found in the Chapter on Procedures 
and Functions, in a subsection called Internal Calling Conventions. 


Each language is slightly different, but here are what I think are some 
common issues in subroutine construction. 


1. NEAR versus FAR? Most of the time, your language will probably call 
your assembler routine as a FAR routine. In this case, you need to 
make sure the assembler will generate the right kind of return. You do 
this with a PROC...ENDP statement pair. The PROC statement is probably 
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a good idea for a NEAR routine too even though it is not strictly 
required: 
FAR linkage: NEAR linkage: 
ARBITRARY SEGMENT 
PUBLIC THENAME 
ASSUME CS : ARBITRARY 
THENAME PROC FAR 
reer code and data 
THENAME ENDP 
ARBITRARY ENDS 
END 


SPECIFIC SEGMENT PUBLIC 

PUBLIC THENAME 

ASSUME CS:SPECIFIC,DS:SPECIFIC 

ASSUME ES:SPECIFIC,SS:SPECIFIC 
THENAME PROC NEAR 

.. code and data .... 

THENAME ENDP 
SPECIFIC ENDS 

END 


With FAR linkage, it doesn't really matter what you call the segment. 
you must declare the name by which you will be called in a PUBLIC pseu- 
do-op and also show that it is a FAR procedure. Only CS will be ini- 
tialized to your segment when you are called. Generally, the other 
segment registers will continue to point to the caller's segments. 


With NEAR linkage, you are executing in the same segment as the caller. 
Therefore, you must give the segment a specific name as instructed by 
the language manual. However, you may be able to count on all segment 
registers pointing to your own segment (sometimes the situation can be 
more complicated but I cannot really go into all of the details). You 
should be aware that the code you write will not be the only thing in 
the segment and will be physically relocated within the segment by the 
linker. However, all OFFSET references will be relocated and will be 
correct at execution time. 


2. Parameters passed on the stack. Usually, high level languages pass 
parameters to subroutines by pushing words onto the stack prior to 
calling you. What may differ from language to language is the nature 
of what is pushed (OFFSET only or OFFSET and SEGMENT) and the order in 
which it is pushed (left to right, right to left within the CALL state- 
ment). However, you will need to study the examples to figure out how 
to retrieve the parameters from the stack. A useful fact to exploit is 
the fact that a reference involving the BP register defaults to a ref- 
erence to the stack segment. So, the following strategy can work: 

ARGS STRUC 
DW 3 DUP(?) ;Saved BP and return address 

ARG3 DW ? 

ARG2 DW ? 

ARG1 DW ? 

ARGS ENDS 
PUSH BP ;save BP register 
MOV BP,SP ;Use BP to address stack 
MOV ...,{[ BP] .ARG2 ;retrieve second argument 
(etc.) 
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This example uses something called a structure, which is only available 
in the large assembler; furthermore, it uses it without allocating it, 
which is not a well-documented option. However, I find the above 
approach generally pleasing. The STRUC is like a DSECT in that it 
establishes labels as being offset a certain distance from an arbitrary 
point; these labels are then used in the body of code by beginning them 
with a period; the construction ".ARG2" means, basically, " + 
(ARG2-ARGS)." 


What you are doing here is using BP to address the stack, accounting 
for the word where you saved the caller's BP and also for the two words 
which were pushed by the CALL instruction. 


How big is the stack? BASIC only gives you an eight word stack to play 
with. On the other hand, it doesn't require you to save any registers 
except the segment registers. Other languages give you a liberal 
stack, which makes things a lot easier. If you have to create a new 
stack segment for yourself, the easiest thing is to place the stack at 
the end of your program and: 


CLI ;suppress interrupts while changing the stack 


MOV SSAVE,SS ;save old SS in local storage (old SP 
; already saved in BP) 

MOV SP,CS ;switch 

MOV SS,SP ; the 

MOV SP,OFFSET STACKTOP ;stack 

STI ; (maybe) 


Later, you can reverse these steps before returning to the caller. At 
the end of your program, you place the stack itself: 


DW 128 DUP(?) ;stack of 128 words (liberal) 
STACKTOP LABEL WORD 


Make sure you save and restore those registers required by the caller. 


Be sure to get the right kind of addressibility. In the FAR call exam- 
ple, only CS addresses your segment. If you are careful with your 
ASSUME statements the assembler will keep track of this fact and gener- 
ate CS prefixes when you make data references; however, you might want 
to do something like 


MOV AX,CS ;get current segment address 
MOV DS ,AX ;To DS 
ASSUME DS:THISSEG 


Be sure you keep your ASSUMEs in synch with reality. 
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Learning about BIOS and the hardware 


You can't do everything with DOS calls. You may need to learn something 
about the BIOS and about the hardware itself. In this, the Technical Ref- 
erence is a very good thing to look at. 


The first thing you look at in the Technical Reference, unless you are 
really determined to master the whole ball of wax, is the BIOS listings 
presented in Appendix A. Glory. be: here is the whole 8K of ROM which deals 
with low level hardware support layed out with comments and everything. 


In fact, if you are just interested in learning what BIOS can do for you, 
you just need to read the header comments at the beginning of each section 
of the listing. 


BIOS services are invoked by means of the INT instruction; the BIOS occu- 
pies interrupts 10H through 1FH and also interrupt 5H; actually, of these 
seventeen interrupts, five are used for user exit points or data pointers, 
leaving twelve actual services. 


In most cases, a service deals with a particular hardware interface; for 
example, BIOS interrupt 10H deals with the screen. As with DOS function 
calls, many BIOS services can be passed a function code in the AH register 
and possible other arguments. 


I am not going to summarize the most useful BIOS features here; you will 
see some examples in the next sample program we will look at. 


The other thing you might want to get into with the Tech reference is the 
description of some hardware options, particularly the asynch adapter, 
which are not well supported in the BIOS. The writeup on the asynch adapt- 
er is pretty complete. 


Actually, the Tech reference itself is pretty complete and very nice as far 
as it goes. One thing which is missing from the Tech reference is informa- 
tion on the programmable peripheral chips on the system board. These 
include 


the 8259 interrupt controller 
the 8253 timer 

the 8237 DMA controller and 
the 8255 peripheral interface 


To make your library absolutely complete, you should order the INTEL data 
sheets for these beasts. 


I should say, though, that the only I ever found I needed to know about was 
the interrupt controller. If you happen to have the 8086 Family User's 
Manual, the big book put out by INTEL, which is one of the things people 
sometimes buy to learn about 8086 architecture, there is an appendix there 
which gives an adequate description of the 8259. 
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A final example 


I leave you with a more substantial example of code which illustrates some 
good elementary techniques; I won't claim its style is perfect, but I think 
it is adequate. I think this is a much more useful example than what you 
will get with the assembler: 


PAGE 61,132 
TITLE SETSCRN -- Establish correct monitor use at boot time 


; This program is a variation on many which toggle the equipment flags 

; to support the use of either video option (monochrome or color). 

; The thing about this one is it prompts the user in such a way that he 
; can select the use of the monitor he is currently looking at (or which 
: is currently connected or turned on) without really having to know 


; which is which. SETSCRN is a good program to put first in an 
7 AUTOEXEC.BAT file. 
; This program is highly dependent on the hardware and BIOS of the IBMPC 


; and is hardly portable, except to very exact clones. For this reason, 
: BIOS calls are used in lieu of DOS function calls where both provide 
; equal function. 


OK. That's the first page of the program. Notice the PAGE statement, 
which you can use to tell the assembler how to format the listing. You 
give it lines per page and characters per line. I have mine setup to print 
on the host lineprinter; I routinely upload my listings at 9600 baud and 
print them on the host; it is faster than using the PC printer. 


There is also a TITLE statement. This simply provides a nice title for 
each page of your listing. Now for the second page: 


SUBTTL -- Provide .COM type environment and Data 


PAGE 
: First, describe the one BIOS byte we are interested in 
BIOSDATA SEGMENT AT 40H ;Describe where BIOS keeps his data 
ORG 10H ;Skip parts we are not interested in 
EQUIP DB ? ;Equipment flag location 
MONO EQU O00110000B ;These bits on if monochrome 
COLOR EQU 11101111B ;Mask to make BIOS think of the color board 


BIOSDATA ENDS ;End of interesting part 


, 


: Next, describe some values for interrupts and functions 


DOS EQU 21H 


;DOS Function Handler INT code 
PRIMSG EQU 09H ;Function code to print a message 
KBD EQU 16H ;BIOS keyboard services INT code 
GETKEY EQU 00H ;Function code to read a character 
SCREEN EQU 10H ;BIOS Screen services INT code 
- MONOINIT EQU 02H ;Value to initialize monochrome screen 
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;COLORINIT EQU 03H 
COLORINIT EQU 01H 


, 


: Now, describe our own segment 


:Value to initialize color screen (80x25) 
;Value to initialize color screen (40X25) 


SETSCRN SEGMENT ;Set operating segment for CODE and DATA 


‘ 


ASSUME CS:SETSCRN,DS:SETSCRN,ES:SETSCRN,SS:SETSCRN ;All segments 
ORG 100H ;Begin assembly at standard .COM offset 

MAIN PROC NEAR ;COM files use NEAR linkage 
JMP BEGIN ;And, it is helpful to put the data first, but 


; ;then you must branch around it. 


Data used in SETSCRN 


o se 


CHANGELOC DD EQUIP ;Location of the EQUIP, recorded as far pointer 
MONOPROMPT DB 'Please press the plus ( + ) key.$! ;User sees on mono 
COLORPROMPT DB ‘Please press the minus ( - ) key.$'! ;User sees on color 


Several things are illustrated on this page. First, in addition to titles, 
the assembler supports subtitles: hence the SUBTTL pseudo-op. Second, the 
PAGE pseudo-op can be, used to go to a new page in the listing. You see an 
example here of the DSECT-style segment in the "SEGMENT AT 40H". Here, our 
our interest is in correctly describing the location of some data in the 
BIOS work area which really is located at segment 40H. 


You will also see illustrated the EQU instruction, which just gives a sym- 
bolic name to a number. I don't make a fetish of giving a name to every 
single number in a program. I do feel strongly, though, that interrupts 
and function codes, where the number is arbitrary and the function being 
performed is the thing of interest, should always be given symbolic names. 


One last new element in this section is the define doubleword (DD) instruc- 
tion. A doubleword constant can refer, as in this case, to a location in 
another segment. The assembler will be happy to use information at its 
disposal to properly assemble it. In this case, the assembler knows that 
EQUIP is offset 10 in the segment BIOSDATA which is at 40H. 


SUBTTL -- Perform function 


PAGE 

BEGIN: CALL MONOON ;Turn on mono display 
MOV DX,OFFSET MONOPROMPT ;GET MONO PROMPT 
MOV AH,PRTMSG ; ISSUE 
INT DOS ;1T 
CALL COLORON ;Turn on color display 
MOV DX,OFFSET COLORPROMPT ;GET COLOR PROMPT 
MOV AH, PRTMSG ; ISSUE 
INT DOS re a 
MOV AH,GETKEY ;Obtain user response 
INT KBD 
CMP AL +e! >Does he want MONO? 
JNZ NOMONO 
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PSE 


CALL MONOON ;yes. give it to him 
NOMONO : RET 
MAIN ENDP 


The main code section makes use of subroutines to keep the basic flow sim- 
ple. About all that's new to you in this section is the use of the BIOS 
interrupt KBD to read a character from the keyboard. 


Now for the subroutines, MONOON and COLORON: 


SUBTTL -- Routines to turn monitors on 
PAGE 
MONOON PROC NEAR ;Turn mono on 
LES DI , CHANGELOC ;Get location to change 
ASSUME ES:BIOSDATA ;TELL ASSEMBLER ABOUT CHANGE TO ES 
OR EQUIP , MONO 
MOV AX ,MONOINIT ;Get screen initialization value 
INT SCREEN ;Initialize screen 
RET 
MONOON ENDP 
COLORON PROC NEAR ;Turn color on 
LES DI , CHANGELOC ;Get location to change 
ASSUME ES :BIOSDATA ;TELL ASSEMBLER ABOUT CHANGE TO ES 
AND EQUIP,COLOR 
MOV AX, COLORINIT ;Get screen initialization value 
INT SCREEN ;Initialize screen 
RET 
COLORON ENDP 
SETSCRN ENDS ;End of segment 
END MAIN ;End of assembly; execution at MAIN 


The instructions LES and LDS are useful ones for dealing with doubleword 
addresses. The offset is loaded into the operand register and the segment 
into ES (for LES) or DS (for LDS). By telling the assembler, with an 
ASSUME, that ES now addresses the BIOSDATA segment, it is able to correctly 
asserble the OR and AND instructions which refer to the EQUIP byte. An ES 
segment prefix is added. 


To understand the action here, you simply need to know that flags in that 
particular byte control how the BIOS screen service initializes the adapt- 
ers. BIOS will only work with one adapter at a time; by setting the equip- 
ment flags to show one or the other as installed and calling BIOS screen 
initialization, we achieve the desired effect. . 


The rest is up to you. 
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